Plots with ggplot2 are better plots!

Mireia Ramos (@mirthelle)

R-Ladies Barcelona, 27th of September 2017

About R-Ladies

What is R-Ladies?

Worldwide organization that promotes gender diversity in the R community via meetups and mentorship in a friendly and safe environment.

R-Ladies Barcelona

  • Created in November 2016 by Maëlle Salmon and Rebeca Huerga.
  • Season 2016-2017: 8 Meetups and more than 250 R-Ladies!

R-Ladies Barcelona v.2

  • Change of organizers in summer 2017: Ania Alay and Mireia Ramos.

More to come in the 2017-2018 season!

Join us on Slack!

Introduction

Who am I?

  • PhD student in Biomedicine and Bioinformatics. I analyze genomic and epigenomic data using R to better understand the pathological basis of Type 1 Diabetes.
  • I have two cats whom I love. #rcatladies
  • I enjoy watching lots of TV series and playing videogames.

Why ggplot2?

ggplot2 is a data visualization package for R developed by Hadley Wickham that provides a structured approach to graphing.

Pros:

  • Standardized method for plotting.
  • Easy to create publication-quality plots.
  • Allows creation of relatively complex plots with ease.
  • One of the most widely used plotting packages in R.

Cons:

  • It can be a little difficult to understand at first.
  • Sometimes you might get lost among all the things you can change.

But in the end it’s worth it!

Basics of ggplot2

Basic elements

We find 4 basic elements in a ggplot2 plot:

  • Data. The data you want to show in the plot.
  • Geometries or geom_. Set the type of graphic and the elements that you want to plot.
  • Aesthetics or aes(). Map a variable to an element in the plot (x and y axes, color, size, shape, etc.).
  • Scales or scale_. Allow you to select the range of values to plot or to map specific colors to factor levels.

Advanced elements [1]

  • Statistical transformations or stat_. Statistical summaries of the data that can be plotted, such as quantiles, fitted curves (loess, linear models, etc.), sums and so on.
  • Coordinate systems or coord_. The transformation used for mapping data coordinates into the plane of the data rectangle.
  • Facets or facet_. The arrangement of the data into a grid of plots (also known as latticing, trellising or creating small multiples).
  • Visual Themes or theme. The overall visual defaults of a plot: background, grids, axes, default typeface, sizes, colors, etc.
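
As a rough illustration of how these four elements stack onto a basic plot, here is a minimal sketch using the built-in iris data set. The specific choices (a linear fit, the y limits, theme_bw()) are assumptions made for the example, not part of the original slides.

```r
library(ggplot2)

p <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
  geom_point() +
  # stat_: add a fitted linear model, computed separately in each panel
  stat_smooth(method = "lm", formula = y ~ x, se = FALSE) +
  # facet_: split the data into one panel per species
  facet_wrap(~ Species) +
  # coord_: zoom the y axis without dropping any data points
  coord_cartesian(ylim = c(2, 4.5)) +
  # theme: switch to one of the predefined ggplot2 themes
  theme_bw()

p
```

Each added layer is independent, so you can remove any of the four lines and the plot will still build.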

Steps for plotting

  1. Convert your data.frame to long format (not always needed!).
  2. Use ggplot() with your data and map your x and y values.
  3. Add the geoms that you want to show and map other aesthetic parameters.

1. Convert your data.frame to long format

Examples of the same table in wide format (first) and long format (second):

Wide format:

Drug  Response.CTRL  Response.T1D
A     3              12
B     6              56
C     8              2

Long format:

Drug  Patient  Response
A     CTRL     3
B     CTRL     6
C     CTRL     8
A     T1D      12
B     T1D      56
C     T1D      2

reshape2::melt() is useful for converting from wide to long format.
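
A minimal sketch of that conversion, assuming the reshape2 package is installed. The object names dat.wide and dat.long and the prefix-stripping step are choices made for this example; the values come from the wide table above.

```r
library(reshape2)

# Wide format: one response column per patient group
dat.wide <- data.frame(
  Drug = c("A", "B", "C"),
  Response.CTRL = c(3, 6, 8),
  Response.T1D = c(12, 56, 2)
)

# Long format: one row per Drug/Patient combination
dat.long <- melt(dat.wide,
                 id.vars = "Drug",
                 variable.name = "Patient",
                 value.name = "Response")

# melt() keeps the old column names as values; drop the "Response." prefix
dat.long$Patient <- sub("^Response\\.", "", dat.long$Patient)

dat.long
```

The resulting dat.long has the Drug, Patient and Response columns that the plotting steps expect.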

2. Map X and Y values

library(ggplot2)

ggplot(dat.long, aes(x=Drug, y=Response))

3. Add geoms and other aesthetics

library(ggplot2)

ggplot(dat.long, aes(x=Drug, y=Response)) +
  geom_point(aes(color=Patient), size=3)

Useful resources

ggplot2 extensions

One of the great things about ggplot2 is that many people are developing extensions to further enhance its plotting potential.

Geometries

Geometries (or geoms) allow you to select which kind of element you want to plot (lines, points, boxplots, bars, etc.).

You can include parameters to tweak the appearance:

  • Inside aes(), parameters such as size, color or shape are mapped to variables in your data.
  • Outside aes(), the same parameters are set to a fixed value for the whole geom.
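
The difference is easiest to see side by side. This is a small sketch on the built-in iris data; the object names p_mapped and p_fixed are only for illustration.

```r
library(ggplot2)

# Inside aes(): color is mapped to a variable, so each species
# gets its own color and a legend is drawn
p_mapped <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
  geom_point(aes(color = Species))

# Outside aes(): color is a fixed setting, so every point is purple
# and no legend is drawn
p_fixed <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
  geom_point(color = "purple")

p_mapped
p_fixed
```

A common mistake is writing aes(color = "purple"), which maps the constant string as if it were a variable and does not give you purple points.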

ggplot2 Quick Reference: geom

geom_point()

data("iris")

ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
  geom_point(aes(color=Species), alpha=0.8, size=2) +
  geom_smooth(method="loess")

geom_boxplot()

ggplot(iris, aes(Species, Petal.Width)) +
  geom_boxplot(aes(color=Species), lwd=1)

geom_bar()

ggplot(iris, aes(Species, after_stat(count))) +
  geom_bar(aes(fill=Species), color="black", lwd=1)

geom_histogram()

ggplot(iris, aes(Sepal.Width)) +
  geom_histogram(aes(fill=Species), bins=30)

Combine geoms!

ggplot(iris) +
  geom_point(aes(Sepal.Length, Sepal.Width, shape=Species), color="dark orange") +
  geom_point(aes(Sepal.Length, Petal.Width, shape=Species), color="purple") +
  ylab("Width (cm)")

Geom extensions!

  • ggraph. Includes new geoms for plotting networks and relations.
  • ggrepel. Lets you add text labels to a plot while avoiding overlaps.
  • ggimage. Instead of points (boring!) you can use images.

Scales

  • You can use scale_ to change the values associated with your variables in aes() or to determine the range of the data you want to show.

  • Make sure to always use the appropriate scale: if you use aes(color=variable), you will need to use scale_color_manual() to change the colors, but if you use aes(fill=variable), you will have to edit scale_fill_manual().
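
A short sketch of matching the scale to the aesthetic, using the iris data. The palette species_cols and the axis limits are arbitrary choices for this example.

```r
library(ggplot2)

# Named vector: one color per level of Species
species_cols <- c(setosa = "darkorange",
                  versicolor = "purple",
                  virginica = "steelblue")

# color aesthetic -> scale_color_manual()
p_color <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
  geom_point(aes(color = Species)) +
  scale_color_manual(values = species_cols) +
  # scale_ can also set the range of values shown on an axis
  scale_x_continuous(limits = c(4, 8))

# fill aesthetic -> scale_fill_manual()
p_fill <- ggplot(iris, aes(Sepal.Width)) +
  geom_histogram(aes(fill = Species), bins = 30) +
  scale_fill_manual(values = species_cols)

p_color
p_fill
```

If you add scale_fill_manual() to a plot that only maps color, the scale has nothing to act on and the default colors stay in place.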

Scales extensions!

  • ggsci. Offers a collection of ggplot2 color palettes inspired by scientific journals, data visualization libraries, science fiction movies, and TV shows.

Themes

Themes determine the general look of your plot: background, font size, font type, etc. You can tweak each parameter separately using the theme() function, or you can use predefined themes from ggplot2.
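
Both approaches can be combined: start from a predefined theme and then adjust individual elements. The settings below are illustrative choices, not recommendations from the original slides.

```r
library(ggplot2)

p <- ggplot(iris, aes(Species, Petal.Width)) +
  geom_boxplot(aes(color = Species))

# Option 1: a predefined theme with a larger base font size
p_minimal <- p + theme_minimal(base_size = 14)

# Option 2: the same theme, then tweak single elements with theme()
custom_theme <- theme_minimal(base_size = 14) +
  theme(legend.position = "bottom",          # move the legend below the plot
        panel.grid.minor = element_blank(),  # hide minor grid lines
        axis.title = element_text(face = "bold"))

p_custom <- p + custom_theme

p_minimal
p_custom
```

Storing the theme in an object such as custom_theme makes it easy to reuse the same look across several plots.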

Themes extensions!

  • cowplot
    • You can load it instead of ggplot2.
    • Different default theme().
    • Create images combining different plots, ready for publication! plot_grid()
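
A minimal sketch of plot_grid(), assuming the cowplot package is installed. It loads both packages explicitly, which is the safe choice across cowplot versions; the two plots and the panel labels are arbitrary examples.

```r
library(ggplot2)
library(cowplot)

p1 <- ggplot(iris, aes(Sepal.Length, Sepal.Width)) +
  geom_point(aes(color = Species))

p2 <- ggplot(iris, aes(Species, Petal.Width)) +
  geom_boxplot(aes(color = Species))

# Arrange both plots side by side and label the panels A and B
combined <- plot_grid(p1, p2, labels = c("A", "B"), ncol = 2)

combined
```

The combined figure can then be written to disk with ggsave() like any other ggplot2 object.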

Time for some exercises!

Final Remarks

  • Hope this small tutorial/workshop was useful for you!

  • We are open to ideas for the October Meetup. If there are no suggestions, we will probably talk about why and how to create R packages.

  • If you have ideas for the upcoming meetups or you want to host one, contact us!

  • We want to include Lightning Talks at the beginning of each Meetup:
    • 5-minute presentations where someone explains their work and how they do it with R.
    • Very basic and simple; they will let us see the different areas where R is used.
    • If you are interested, please contact us!

Thank you for coming!


[1] Extracted from “A Simple Introduction to the Graphing Philosophy of ggplot2” by Tom Hopper.